Using Lightweight Procedures to Improve Instruction Cache Performance
Abstract
Instruction cache performance is widely recognized as a critical component of overall program performance, especially for large applications such as database servers. In this report, we present a technique for (1) identifying repeated blocks of instructions in a program executable, and (2) converting these repeated code blocks into lightweight procedures (LWprocs). The use of LWprocs reduces the static code size of a program, and can potentially reduce the working set size of the process, at the cost of increasing its dynamic instruction count. For most programs, the tradeoff seems to favor the reduction in working set size. Even with a simple model of program structure and a straightforward technique for generating LWprocs, we find performance improvements of between 3% and 9% for programs in the SPECINT95 suite. However, the technique leads to slowdowns (between 5% and 27%) for some programs, suggesting that lightweight procedures should be used with care.
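The two steps named in the abstract can be illustrated with a minimal sketch. It is not the report's implementation. It assumes instructions are plain strings, that repeated blocks are found by exact matching over fixed-length windows, and that a hypothetical `call`/`ret` pair stands in for whatever linkage a real LWproc would use. The names `find_repeated_blocks` and `outline_block` are made up for this illustration.

```python
from collections import defaultdict


def find_repeated_blocks(instructions, block_len):
    """Return {window: [start offsets]} for windows that occur more than once."""
    positions = defaultdict(list)
    for start in range(len(instructions) - block_len + 1):
        window = tuple(instructions[start:start + block_len])
        positions[window].append(start)
    return {w: p for w, p in positions.items() if len(p) > 1}


def outline_block(instructions, block, name):
    """Replace each non-overlapping occurrence of `block` with a call.

    Returns (rewritten instruction list, body of the new procedure).
    "call" and "ret" are hypothetical pseudo-instructions.
    """
    n = len(block)
    rewritten = []
    i = 0
    while i < len(instructions):
        if tuple(instructions[i:i + n]) == tuple(block):
            rewritten.append(f"call {name}")
            i += n
        else:
            rewritten.append(instructions[i])
            i += 1
    return rewritten, list(block) + ["ret"]


# Toy straight-line program in which one 3-instruction block appears three times.
block = ("ld r1, 0(r2)", "add r1, r1, 1", "st r1, 0(r2)")
code = list(block) + ["mul r3, r3, r4"] + list(block) + ["sub r5, r5, 1"] + list(block)

repeats = find_repeated_blocks(code, 3)
main, body = outline_block(code, block, "lwproc_0")

static_before = len(code)              # instructions stored before outlining
static_after = len(main) + len(body)   # instructions stored after outlining
dynamic_before = len(code)             # straight-line code: each executes once
# Each call executes itself plus the whole procedure body (including "ret").
dynamic_after = sum(1 + len(body) if ins.startswith("call ") else 1 for ins in main)

print(static_before, static_after, dynamic_before, dynamic_after)
```

On this toy input the stored instruction count falls while the executed instruction count rises, the same direction of tradeoff the abstract describes. The sketch does not model the cache, so it says nothing about whether the change would pay off.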
Similar resources
Design and Implementation of a Lightweight Dynamic Optimization System
Many opportunities exist to improve micro-architectural performance due to performance events that are difficult to optimize at static compile time. Cache misses and branch mis-prediction patterns may vary for different micro-architectures using different inputs. Dynamic optimization provides an approach to address these and other performance events at runtime. This paper describes a software s...
Temporal-Based Procedure Reordering for Improved Instruction Cache Performance
As the gap between memory and processor performance continues to grow, it becomes increasingly important to exploit cache memory effectively. Both hardware and software techniques can be used to better utilize the cache. Hardware solutions focus on organization, while most software solutions investigate how to best layout a program on the available memory space. In this paper we present a new l...
Combining Instruction Prefetching with Partial Cache Locking to Improve WCET in Real-Time Systems
Caches play an important role in embedded systems to bridge the performance gap between fast processor and slow memory. And prefetching mechanisms are proposed to further improve the cache performance. While in real-time systems, the application of caches complicates the Worst-Case Execution Time (WCET) analysis due to its unpredictable behavior. Modern embedded processors often equip locking m...
Analysis of Temporal-Based Program Behavior for Improved Instruction Cache Performance
In this paper, we examine temporal-based program interaction in order to improve layout by reducing the probability that program units will conflict in an instruction cache. In that context, we present two profile-guided procedure reordering algorithms. Both techniques use cache line coloring to arrive at a final program layout and target the elimination of first generation cache conflicts (i....
Efficient Procedure Mapping Using Cache Line Coloring
As the gap between memory and processor performance continues to widen, it becomes increasingly important to exploit cache memory effectively. Both hardware and software approaches can be explored to optimize cache performance. Hardware designers focus on cache organization issues, including replacement policy, associativity, line size and the resulting cache access time. Software writers use va...